Möbius transformation



Neural Isometries: Taming Transformations for Equivariant ML

Neural Information Processing Systems

While finite-dimensional irreducible representations (IRs) are attractive building blocks for equivariance due to their computationally exploitable structure, they often don't exist for non-compact groups, precluding generalizations to most non-linear symmetries, let alone those ill-modeled by groups.


Möbius Transformation for Fast Inner Product Search on Graph

Neural Information Processing Systems

We present a fast graph-based search algorithm for Maximum Inner Product Search (MIPS). This optimization problem is challenging since traditional Approximate Nearest Neighbor (ANN) search methods may not perform efficiently under the non-metric similarity measure. Our proposed method is based on the property that a Möbius transformation induces an isomorphism between a subgraph of the $\ell^2$-Delaunay graph and the Delaunay graph for inner product. Based on this observation, we propose a simple but novel graph indexing and searching algorithm to find the optimal solution with the largest inner product with the query. Experiments show our approach leads to significant improvements compared to existing methods.
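As a rough illustration of the problem and of the kind of map involved, the sketch below defines MIPS by exhaustive scan and applies the sphere inversion $x \mapsto x/\|x\|^2$, one simple Möbius transformation. The abstract does not state the formula, so the choice of this particular map and the helper names `brute_force_mips` and `mobius_inversion` are assumptions made for illustration. No graph index is built here.

```python
import math


def inner(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def brute_force_mips(data, query):
    """Exact MIPS by exhaustive scan: index of the item maximizing <x, query>."""
    return max(range(len(data)), key=lambda i: inner(data[i], query))


def mobius_inversion(x):
    """Assumed Möbius map: inversion in the unit sphere, x -> x / ||x||^2."""
    sq_norm = inner(x, x)
    if sq_norm == 0.0:
        raise ValueError("the origin has no finite image under inversion")
    return [v / sq_norm for v in x]


data = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
query = [1.0, 1.0]
best = brute_force_mips(data, query)

# Inversion replaces each norm r by 1/r and is its own inverse.
images = [mobius_inversion(x) for x in data]
norms_after = [math.sqrt(inner(y, y)) for y in images]
```

In this toy data set the item with the largest norm, which also wins the query, ends up closest to the origin after inversion. That is the intuition for why an $\ell^2$ structure built on the transformed points can be informative about inner-product search.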



Hyperspherical Variational Autoencoders Using Efficient Spherical Cauchy Distribution

arXiv.org Machine Learning

We propose a novel variational autoencoder (VAE) architecture that employs a spherical Cauchy (spCauchy) latent distribution. Unlike traditional Gaussian latent spaces or the widely used von Mises-Fisher (vMF) distribution, spCauchy provides a more natural hyperspherical representation of latent variables, better capturing directional data while maintaining flexibility. Its heavy-tailed nature prevents over-regularization, ensuring efficient latent space utilization while offering a more expressive representation. Additionally, spCauchy circumvents the numerical instabilities inherent to vMF, which arise from computing normalization constants involving Bessel functions. Instead, it enables a fully differentiable and efficient reparameterization trick via Möbius transformations, allowing for stable and scalable training. The KL divergence can be computed through a rapidly converging power series, eliminating concerns of underflow or overflow associated with evaluation of ratios of hypergeometric functions. These properties make spCauchy a compelling alternative for VAEs, offering both theoretical advantages and practical efficiency in high-dimensional generative modeling.
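The reparameterization idea mentioned above can be sketched as follows: draw parameter-free noise uniformly on the sphere, then push it through a sphere-preserving Möbius map that depends smoothly on a location `mu` and a concentration `rho`. The specific map used here, and the names `mobius_on_sphere` and `sample_reparameterized`, are assumptions for illustration; the abstract does not give the paper's exact parameterization.

```python
import math
import random


def sample_uniform_sphere(dim, rng):
    """Uniform point on the unit sphere, obtained by normalizing a Gaussian vector."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def mobius_on_sphere(u, mu, rho):
    """Assumed map: u -> (1 - rho^2) (u + rho*mu) / ||u + rho*mu||^2 + rho*mu.

    mu is a unit vector (location) and 0 <= rho < 1 (concentration).
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    w = [rho * m for m in mu]
    shifted = [a + b for a, b in zip(u, w)]
    sq_norm = sum(v * v for v in shifted)
    scale = (1.0 - rho * rho) / sq_norm
    return [scale * s + b for s, b in zip(shifted, w)]


def sample_reparameterized(mu, rho, rng):
    """Draw parameter-free noise first, then push it through the Möbius map."""
    u = sample_uniform_sphere(len(mu), rng)
    return mobius_on_sphere(u, mu, rho)


rng = random.Random(0)
z = sample_reparameterized([1.0, 0.0, 0.0], 0.5, rng)
```

Because the randomness lives entirely in `u`, the output is a deterministic, differentiable function of `mu` and `rho`. With `rho = 0` the map is the identity, and as `rho` approaches 1 the samples concentrate around `mu`.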


Clustering in hyperbolic balls

arXiv.org Artificial Intelligence

The idea of representing data in negatively curved manifolds has recently attracted a lot of attention and given rise to a new research direction named hyperbolic machine learning (ML). In order to unveil the full potential of this new paradigm, efficient techniques for data analysis and statistical modeling in hyperbolic spaces are necessary. In the present paper, a rigorous mathematical framework for clustering in hyperbolic spaces is established. First, we introduce $k$-means clustering in hyperbolic balls, based on a novel definition of the barycenter. Second, we present the expectation-maximization (EM) algorithm for learning mixtures of novel probability distributions in hyperbolic balls. In this way, we lay the foundation of unsupervised learning in hyperbolic spaces.


Reviews: Möbius Transformation for Fast Inner Product Search on Graph

Neural Information Processing Systems

This paper proposes a new algorithm for maximum inner product search (MIPS), a problem encountered in a wide range of applications. Though seemingly similar to the ANN problem, MIPS differs in both theory and algorithm design, so the large body of KNN methods cannot be applied directly. The authors extend the well-known Delaunay-graph family of ANN methods to MIPS and provide both theoretical discussion and experimental evidence to show the advantage of the proposed method. I find this paper interesting, and would like the authors to consider the following comments: 1. For assumption 1, I'm a little confused.


Expanding Expressivity in Transformer Models with MöbiusAttention

arXiv.org Artificial Intelligence

Attention mechanisms and Transformer architectures have revolutionized Natural Language Processing (NLP) by enabling exceptional modeling of long-range dependencies and capturing intricate linguistic patterns. However, their inherent reliance on linear operations in the form of matrix multiplications limits their ability to fully capture inter-token relationships on their own. We propose MöbiusAttention, a novel approach that integrates Möbius transformations within the attention mechanism of Transformer-based models. Möbius transformations are non-linear operations in spaces over complex numbers with the ability to map between various geometries. By incorporating these properties, MöbiusAttention empowers models to learn more intricate geometric relationships between tokens and capture a wider range of information through complex-valued weight vectors. We build and pre-train a BERT and a RoFormer version enhanced with MöbiusAttention, which we then finetune on the GLUE benchmark. We empirically evaluate our approach against the baseline BERT and RoFormer models on a range of downstream tasks. Our approach compares favorably against the baseline models, even with a smaller number of parameters, suggesting the enhanced expressivity of MöbiusAttention. This research paves the way for exploring the potential of Möbius transformations in the complex projective space to enhance the expressivity and performance of foundation models.

At the heart of Transformers' success lies the attention mechanism (Vaswani et al., 2017), a powerful tool that enables them to identify relationships between different parts of the data, be it words in a sentence or image patches in a scene. Despite their remarkable impact, current transformers face limitations. A key constraint is the inherent linearity of the attention mechanism, which primarily relies on weights learned through linear transformations, matrix multiplications, and the softmax function. While softmax is a non-linear operation, it is only used to produce a probability distribution over the elements, signaling their relative importance in comparison to the others, and not to introduce non-linear interdependencies. Predominantly linear operations restrict the ability of models to capture complex linguistic dependencies, leading to potential information loss within each attention layer, as shown by recent research (Zhang, 2023).

Figure 1: Various Möbius transformations. Each sub-figure shows flows from a single point after successive transformations. Elliptic Möbius has two fixed points at the centers of two circular flows.
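To make the contrast with linear operations concrete, the sketch below implements a generic Möbius transformation $z \mapsto (az + b)/(cz + d)$ on complex numbers, checks that it is not additive, and shows that composing two such maps corresponds to multiplying their coefficient matrices. The helper names `mobius` and `compose_coefficients` are hypothetical, and this is not the MöbiusAttention layer itself, whose construction is not given in the excerpt.

```python
def mobius(a, b, c, d):
    """Return z -> (a*z + b) / (c*z + d), requiring a*d - b*c != 0."""
    if a * d - b * c == 0:
        raise ValueError("degenerate map: a*d - b*c must be non-zero")

    def apply(z):
        return (a * z + b) / (c * z + d)

    return apply


def compose_coefficients(m1, m2):
    """Coefficients of 'm1 after m2', computed as a 2x2 matrix product."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


f = mobius(1, 0, 1, 1)  # z -> z / (z + 1)
# A linear map would give f(1) + f(2) == f(3); here the gap is non-zero.
additive_gap = f(1 + 0j) + f(2 + 0j) - f(3 + 0j)

m1 = (1, 1j, 0, 1)  # translation by i
m2 = (0, 1, 1, 0)   # inversion z -> 1/z
z0 = 0.5 + 0.25j
direct = mobius(*m1)(mobius(*m2)(z0))
via_matrix = mobius(*compose_coefficients(m1, m2))(z0)
```

The four complex coefficients are the only parameters of the map, yet the division makes the output a non-linear function of the input.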


Conformally Natural Families of Probability Distributions on Hyperbolic Disc with a View on Geometric Deep Learning

arXiv.org Artificial Intelligence

We introduce a novel family of probability distributions on the hyperbolic disc. The distinctive property of the proposed family is invariance under the actions of the group of disc-preserving conformal mappings. The group-invariance property renders it a convenient and tractable model for encoding uncertainties in hyperbolic data. Potential applications in Geometric Deep Learning and bioinformatics are numerous; some of them are briefly discussed. We also emphasize analogies with hyperbolic coherent states in quantum physics.
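The disc-preserving conformal mappings referred to above can be illustrated with the standard Möbius form $z \mapsto e^{i\theta}(z - a)/(1 - \bar{a}z)$ with $|a| < 1$. The sketch below checks numerically that such a map keeps points inside the unit disc and leaves the Poincaré-disc distance unchanged. The function names are hypothetical, and the proposed distributions themselves are not reproduced, since the abstract does not define them.

```python
import cmath
import math


def disc_automorphism(a, theta):
    """Return z -> exp(i*theta) * (z - a) / (1 - conj(a) * z) for |a| < 1."""
    if abs(a) >= 1.0:
        raise ValueError("a must lie strictly inside the unit disc")
    rotation = cmath.exp(1j * theta)

    def apply(z):
        return rotation * (z - a) / (1 - a.conjugate() * z)

    return apply


def hyperbolic_distance(z, w):
    """Poincaré-disc distance: 2 * artanh(|z - w| / |1 - conj(w) * z|)."""
    return 2.0 * math.atanh(abs(z - w) / abs(1 - w.conjugate() * z))


g = disc_automorphism(0.3 + 0.2j, 0.7)
z, w = 0.1 - 0.4j, -0.5 + 0.2j
before = hyperbolic_distance(z, w)
after = hyperbolic_distance(g(z), g(w))
```

A family of distributions that is invariant under this group is carried into itself by every such `g`, which is what makes the maps a natural symmetry to respect when modeling data on the disc.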